R has a rich package ecosystem for read/writing, manipulating, and analyzing geospatial data.
Some core packages:
sp - core classes for handling spatial data, additional utility functions - Deprecated
rgdal - R interface to gdal (Geospatial Data Abstraction Library) for reading and writing spatial data - Deprecated
rgeos - R interface to geos (Geometry Engine Open Source) library for querying and manipulating spatial data. Reading and writing WKT. - Deprecated
sf - Combines the functionality of sp, rgdal, and rgeos into a single package based on tidy simple features.
raster - classes and tools for handling spatial raster data.
stars - Reading, manipulating, writing and plotting spatiotemporal arrays (rasters)
See more - Spatial task view
sfThis is the hardest part of using the sf package, difficulty comes from is dependence on several external libraries (geos, gdal, and proj).
Windows - installing from source works when Rtools is installed (system requirements are downloaded from rwinlib)
MacOS - install dependencies via homebrew: gdal2, geos, proj.
Linux - Install development packages for GDAL (>= 2.0.0), GEOS (>= 3.3.0) and Proj.4 (>= 4.8.0) from your package manager of choice.
More specific details are included in the README on github.
sf
st_read() / st_write() - Shapefile, GeoJSON, KML, …read_sf() / write_sf() - Same, supports tibbles …st_as_sfc() / st_as_wkt() - sf <-> WKTst_as_sfc() / st_as_binary() - sf <-> WKBst_as_sfc() / as(x, "Spatial") - sf <-> spSimple feature collection with 100 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -84.32186 ymin: 33.84175 xmax: -75.46003 ymax: 36.58815
Geodetic CRS: NAD83
# A tibble: 100 × 9
AREA PERIM…¹ COUNT…² STATE COUNTY FIPS STATE…³ SQUAR…⁴
<dbl> <dbl> <dbl> <chr> <chr> <chr> <chr> <dbl>
1 0.112 1.61 1994 NC Ashe … 37009 37 429.
2 0.0616 1.35 1996 NC Alleg… 37005 37 236.
3 0.140 1.77 1998 NC Surry… 37171 37 539.
4 0.0891 1.43 1999 NC Gates… 37073 37 342.
5 0.0687 4.43 2000 NC Curri… 37053 37 264.
6 0.119 1.40 2001 NC Stoke… 37169 37 456.
7 0.0626 2.11 2002 NC Camde… 37029 37 241.
8 0.115 1.46 2003 NC Warre… 37185 37 444.
9 0.143 2.40 2004 NC North… 37131 37 551.
10 0.0925 1.81 2005 NC Hertf… 37091 37 356.
# … with 90 more rows, 1 more variable:
# geometry <MULTIPOLYGON [°]>, and abbreviated variable
# names ¹PERIMETER, ²COUNTYP010, ³STATE_FIPS, ⁴SQUARE_MIL
Simple feature collection with 940 features and 16 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -176.646 ymin: 17.70156 xmax: -64.80172 ymax: 71.28545
Geodetic CRS: NAD83
# A tibble: 940 × 17
AIRPRTX…¹ FEATURE ICAO IATA AIRPT…² CITY STATE STATE…³
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 0 AIRPORT KGON GON GROTON… GROT… CT 09
2 3 AIRPORT K6S5 6S5 RAVALL… HAMI… MT 30
3 4 AIRPORT KMHV MHV MOJAVE… MOJA… CA 06
4 6 AIRPORT KSEE SEE GILLES… SAN … CA 06
5 7 AIRPORT KFPR FPR ST LUC… FORT… FL 12
6 8 AIRPORT KRYY RYY COBB C… ATLA… GA 13
7 10 AIRPORT KMKL MKL MC KEL… JACK… TN 47
8 11 AIRPORT KCCR CCR BUCHAN… CONC… CA 06
9 13 AIRPORT KJYO JYO LEESBU… LEES… VA 51
10 15 AIRPORT KCAD CAD WEXFOR… CADI… MI 26
# … with 930 more rows, 9 more variables: COUNTY <chr>,
# FIPS <chr>, TOT_ENP <dbl>, LATITUDE <dbl>,
# LONGITUDE <dbl>, ELEV <dbl>, ACT_DATE <chr>,
# CNTL_TWR <chr>, geometry <POINT [°]>, and abbreviated
# variable names ¹AIRPRTX010, ²AIRPT_NAME, ³STATE_FIPS
Simple feature collection with 233 features and 3 fields
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: -7472582 ymin: 2911107 xmax: 2443707 ymax: 8208428
Projected CRS: NAD83 / UTM zone 15N
# A tibble: 233 × 4
ROUTE_NUM DIST_MILES DIST_KM geometry
<chr> <dbl> <dbl> <MULTILINESTRING [m]>
1 I10 2449. 3941. ((-1881200 4072307, -187992…
2 I105 20.8 33.4 ((-1910156 5339585, -191013…
3 I110 41.4 66.6 ((1054139 3388879, 1054287 …
4 I115 1.58 2.55 ((-1013796 5284243, -101313…
5 I12 85.3 137. ((680741.7 3366581, 682709.…
6 I124 1.73 2.79 ((1201467 3906285, 1201643 …
7 I126 3.56 5.72 ((1601502 3829718, 1602136 …
8 I129 3.1 4.99 ((217446 4705389, 217835.1 …
9 I135 96.3 155. ((96922.97 4313125, 96561.8…
10 I15 1436. 2311 ((-882875.7 5602902, -88299…
# … with 223 more rows
sf structuresf [100 × 9] (S3: sf/tbl_df/tbl/data.frame)
$ AREA : num [1:100] 0.1118 0.0616 0.1402 0.0891 0.0687 ...
$ PERIMETER : num [1:100] 1.61 1.35 1.77 1.43 4.43 ...
$ COUNTYP010: num [1:100] 1994 1996 1998 1999 2000 ...
$ STATE : chr [1:100] "NC" "NC" "NC" "NC" ...
$ COUNTY : chr [1:100] "Ashe County" "Alleghany County" "Surry County" "Gates County" ...
$ FIPS : chr [1:100] "37009" "37005" "37171" "37073" ...
$ STATE_FIPS: chr [1:100] "37" "37" "37" "37" ...
$ SQUARE_MIL: num [1:100] 429 236 539 342 264 ...
$ geometry :sfc_MULTIPOLYGON of length 100; first list element: List of 1
..$ :List of 1
.. ..$ : num [1:1030, 1:2] -81.7 -81.7 -81.7 -81.6 -81.6 ...
..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
- attr(*, "sf_column")= chr "geometry"
- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA
..- attr(*, "names")= chr [1:8] "AREA" "PERIMETER" "COUNTYP010" "STATE" ...
sf classesCoordinate Reference System:
User input: NAD83
wkt:
GEOGCRS["NAD83",
DATUM["North American Datum 1983",
ELLIPSOID["GRS 1980",6378137,298.257222101,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["latitude",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["longitude",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4269]]
Coordinate Reference System:
User input: NAD83 / UTM zone 15N
wkt:
PROJCRS["NAD83 / UTM zone 15N",
BASEGEOGCRS["NAD83",
DATUM["North American Datum 1983",
ELLIPSOID["GRS 1980",6378137,298.257222101,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4269]],
CONVERSION["UTM zone 15N",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",-93,
ANGLEUNIT["Degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
ID["EPSG",26915]]
We have observed counts of lip cancer for 56 districts in Scotland. Let \(y_i\) represent the number of lip cancer for district \(i\).
\[\begin{aligned} y_i &\sim \text{Poisson}(\lambda_i) \\ \\ \log(\lambda_i) &= \log(E_i) + x_i \beta + \omega_i \\ \\ \boldsymbol{\omega} &\sim N(\boldsymbol{0},~\sigma^2(\boldsymbol{D}-\phi\,\boldsymbol{A})^{-1}) \end{aligned}\]
where \(E_i\) is the expected counts for each region (and serves as an offset), \(\boldsymbol{D}\) is a diagonal matrix of neighbor counts, and \(\boldsymbol{A}\) is the adjacency matrix.
car_model = "
data {
int<lower=0> N;
int<lower=0> p;
int<lower=0> y[N];
matrix[N,N] A;
matrix[N,p] X;
vector[N] offset;
}
transformed data {
vector[N] nb = A * rep_vector(1, N);
matrix[N,N] D = diag_matrix(nb);
}
parameters {
vector[N] w_s;
vector[p] beta;
real<lower=0> sigma2;
real<lower=0,upper=1> phi;
}
transformed parameters {
vector[N] eta = log(offset) + X * beta + w_s;
}
model {
matrix[N,N] Sigma_inv = (D - phi * A) / sigma2;
w_s ~ multi_normal_prec(rep_vector(0,N), Sigma_inv);
beta ~ normal(0,10);
sigma2 ~ cauchy(0,5);
y ~ poisson_log(eta);
}
"# A tibble: 56 × 6
.variable i mean q25 q75 sd
<chr> <int> <dbl> <dbl> <dbl> <dbl>
1 eta 1 1.63 1.40 1.88 0.363
2 eta 2 0.913 0.649 1.19 0.400
3 eta 3 2.30 2.14 2.47 0.247
4 eta 4 3.62 3.52 3.73 0.157
5 eta 5 0.721 0.509 0.947 0.333
6 eta 6 2.21 2.06 2.38 0.247
7 eta 7 2.64 2.49 2.80 0.233
8 eta 8 2.09 1.89 2.31 0.311
9 eta 9 2.72 2.57 2.89 0.233
10 eta 10 2.91 2.77 3.06 0.213
# … with 46 more rows
(car_lip_cancer = lip_cancer %>%
mutate(i = seq_len(n())) %>%
left_join(
car_pred %>% filter(.variable == "eta"),
by = "i"
) %>%
mutate(y_pred = exp(mean)))Simple feature collection with 56 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 7150.759 ymin: 529557.2 xmax: 468393.4 ymax: 1218479
CRS: +init=epsg:27700 +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +datum=OSGB36 +units=m +no_defs +ellps=airy +towgs84=446.448,-125.157,542.060,0.1502,0.2470,0.8421,-20.4894
First 10 features:
District id Observed Expected pcaff Latitude
1 Sutherland 12 5 1.8 16 58.06
2 Nairn 13 3 1.1 10 57.47
3 Inverness 19 9 5.5 7 57.24
4 Banff-Buchan 2 39 8.7 16 57.56
5 Bedenoch 17 2 1.1 10 57.06
6 Kincardine 16 9 4.6 16 57.00
7 Angus 21 16 10.5 7 56.75
8 Dundee 50 6 19.6 1 56.45
9 NE Fife 15 17 7.8 7 56.30
10 Kirkcaldy 25 19 15.5 1 56.20
Longitude i .variable mean q25 q75
1 4.64 1 eta 1.6311044 1.3963210 1.8827401
2 3.98 2 eta 0.9131655 0.6487173 1.1882329
3 4.73 3 eta 2.2966728 2.1352521 2.4702543
4 2.36 4 eta 3.6213156 3.5167991 3.7266633
5 4.09 5 eta 0.7206699 0.5091857 0.9473425
6 3.00 6 eta 2.2148004 2.0552073 2.3847897
7 2.98 7 eta 2.6439220 2.4885623 2.8049091
8 3.20 8 eta 2.0947916 1.8945332 2.3092128
9 3.10 9 eta 2.7245066 2.5715914 2.8863322
10 3.30 10 eta 2.9099389 2.7678132 3.0555052
sd geometry y_pred
1 0.3629014 MULTIPOLYGON (((254301.9 96... 5.109515
2 0.3996257 MULTIPOLYGON (((292629.6 85... 2.492199
3 0.2468279 MULTIPOLYGON (((214097.9 84... 9.941052
4 0.1568925 MULTIPOLYGON (((351078.2 86... 37.386722
5 0.3326180 MULTIPOLYGON (((239274.9 77... 2.055810
6 0.2472608 MULTIPOLYGON (((395569.4 80... 9.159580
7 0.2333714 MULTIPOLYGON (((379822.4 76... 14.068271
8 0.3112501 MULTIPOLYGON (((357538.2 73... 8.123748
9 0.2334651 MULTIPOLYGON (((324911.9 71... 15.248888
10 0.2130545 MULTIPOLYGON (((343283.3 70... 18.355678
Sta 344 - Fall 2022